Simple
Free
Flexible
Trendy
Open a new R script file (File > New File > R Script)
Console
Script
Tracking panel
Multipurpose panel
Check files in your computer, see plots, manage packages, read help section of a function.
Caution
Write everything you do in scripts to avoid losing your work.
Tip
Solving errors is an important skill to learn.
Tip
Comment your script to help you remember what you have done.
Complex type
Note
These are the basic complex types. Many other complex objects exist that combine these basic ones.
- Names cannot be reserved words (if, else, TRUE, FALSE)
- Valid names: valid.name, valid_name, valid2name3
- Invalid names: valid name, valid-name, 1valid2name3
Tip
Avoid uninformative names such as var1, var2. Use meaningful names: gene_list, nb_elements
The name of the object followed by the assignment symbol and the value.
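As a minimal sketch of this syntax (the object names are made up for illustration), `<-` is the usual assignment symbol and `=` also works at the top level:

```r
# Hypothetical object names, chosen only for illustration
nb_elements <- 5          # numeric value
gene_name <- "BRCA1"      # character value
is_expressed = TRUE       # `=` also assigns at the top level

# Typing the name of an object prints its value
nb_elements
```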
You can use operators on objects to modify them. Depending on the object format, operators have different behaviors and some are forbidden.
+ - * /
^ or ** (power)
%/% (integer division)
%% (remainder of the integer division)
Exercise
Correction
Exercise
Try to raise errors using operators.
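One possible way to explore this, sketched with made-up values: the arithmetic operators work on numeric objects, while the same operator on a character object raises an error. The use of `tryCatch()` is an addition here, only so that the script keeps running after the error.

```r
x <- 7
y <- 2

x / y     # division
x %/% y   # integer division
x %% y    # remainder of the integer division
x ^ y     # power, same as x ** y

# "7" is a character value, so `+` is forbidden and raises an error.
# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch("7" + 2, error = function(e) conditionMessage(e))
msg
```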
function_name(object, parameter1 = ..., parameter2 = ...)
Getting help on a function (help, ? or F1)
Note
Some functions are in the default installation of R. Other functions come from packages. You can also create your own functions.
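A short illustration of a function call, assuming the built-in `round()` function and a made-up vector: the object comes first and the parameter is given by name.

```r
values <- c(3.14159, 2.71828, 1.41421)

# Positional argument for the object, named argument for the parameter
rounded <- round(values, digits = 2)
rounded

# Open the help page of a function (both forms are equivalent)
# help(round)
# ?round
```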
Vector construction
c() Concatenate function
1:10 Vector with numbers from 1 to 10
Extra
seq() Create a sequence of numbers
rep() Repeat elements several times
runif() Simulate random numbers from a uniform distribution. Same for rnorm(), rpois()…
Instructions
- Create a vector with 7 numeric values
- Create a vector with 7 character values
Manipulation
Using index/position between []
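The construction functions and the bracket syntax can be sketched together as follows. The vectors are hypothetical; note that positions start at 1 in R.

```r
# Hypothetical vectors for illustration
ages <- c(23, 35, 41, 18, 52, 29, 60)   # 7 numeric values
one_to_ten <- 1:10
evens <- seq(2, 10, by = 2)
repeated <- rep(c("a", "b"), times = 3)

ages[1]          # first element
ages[c(2, 4)]    # second and fourth elements
ages[2:4]        # elements 2 to 4
ages[-1]         # everything except the first element
ages[ages > 30]  # elements selected by a logical test
```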
Characterization
length() Number of elements in the vector
names() Get or set the names of the vector's values
Manipulation
sort() Sort a vector
sample() Shuffle a vector
rev() Reverse a vector
Extra
sort()/sample() Explore extra parameters
order() Get the indices of the sorted elements
Exploration
head()/tail() Print the first/last values
summary() Summary statistics
min()/max()/mean()/median()/var() Minimum, maximum, average, median, variance
sum() Sum of the vector's values
Extra
log()/log2()/log10() Logarithm functions
sqrt() Square-root function
Arithmetic operators
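On a vector, arithmetic operators and mathematical functions are applied to every element. A minimal sketch with made-up values, which also shows `sort()` and `order()` side by side:

```r
expr <- c(4, 16, 1, 64, 8)   # hypothetical values

expr + 1           # the operation is applied to every element
expr * 2
expr / sum(expr)   # each value as a fraction of the total

log2(expr)         # functions are applied element by element too
sqrt(expr)

sort(expr)                      # sorted values
sort(expr, decreasing = TRUE)   # an extra parameter of sort()
order(expr)                     # positions of the sorted elements
```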
Instructions
- Generate a vector of random numbers (runif or rnorm)
- Plot its distribution with the hist function
Objects
- Simple types: numeric, character and boolean
- Complex types: vectors, lists, matrices and data.frames
Vectors
- A vector is a sequence of elements that are all of the same type
- [] are used to manipulate elements of a vector
matrix(): Creates a matrix from a vector
rbind()/cbind(): Binds multiple vectors of the same length to create a matrix
mat[i,j]: To select the element at row i and column j. i and j can be vectors to select multiple elements.
t(): To transpose the matrix
rbind()/cbind(): To concatenate matrices vertically or horizontally
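The matrix functions above can be sketched on a small hypothetical matrix:

```r
# Hypothetical 2 x 3 matrix, filled column by column (the default)
mat <- matrix(1:6, nrow = 2, ncol = 3)

# Same values built with rbind(): each vector becomes a row
mat2 <- rbind(c(1, 3, 5), c(2, 4, 6))

mat[1, 2]        # element at row 1, column 2
mat[1, ]         # whole first row
mat[, c(1, 3)]   # columns 1 and 3
t(mat)           # transposed matrix: 3 rows, 2 columns

cbind(mat, mat2) # concatenate horizontally: 2 rows, 6 columns
```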
Instructions
dim(): Returns the dimensions of the matrix, i.e. number of rows and columns
rownames()/colnames(): Get or set the names of the rows/columns
length(), head(), tail()
For numeric matrices: min(), max(), sum(), mean(), summary()…
Arithmetic operations: +, -, *, /
Instructions
apply() function
Your new best friend
In the second argument, 1 means rows and 2 means columns
Instructions
Some functions are shortcuts for common apply() calls:
rowSums() equivalent to apply(x, 1, sum)
colSums() equivalent to apply(x, 2, sum)
rowMeans() equivalent to apply(x, 1, mean)
colMeans() equivalent to apply(x, 2, mean)
A data frame is like a matrix but it can be composed of different data types.
data.frame() To create a data frame
as.data.frame() To transform a matrix into a data frame
[] or $name To select elements of a data.frame
t() To transpose a data frame (the result is a matrix)
dim(): Returns the dimensions of the data frame, i.e. number of rows and columns
rownames()/colnames(): Get or set the names of the rows/columns
rbind()/cbind(): To concatenate data frames vertically or horizontally
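A small sketch of these functions, using a made-up data frame that mixes character, numeric and logical columns:

```r
# Hypothetical data frame with three different column types
df <- data.frame(
  gene = c("geneA", "geneB", "geneC"),
  expression = c(1.2, 0, 3.4),
  expressed = c(TRUE, FALSE, TRUE)
)

df$expression              # select a column by name
df[2, ]                    # select the second row
df[df$expressed, "gene"]   # genes flagged as expressed

dim(df)                    # number of rows, then number of columns
colnames(df)

# A numeric matrix converted into a data frame
df2 <- as.data.frame(matrix(1:6, nrow = 3))
```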
Instructions
You can use arithmetic operations (+, -, *, /) and functions (apply() and others) as with a matrix.
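For instance, on a hypothetical all-numeric data frame, `apply()` with 1 gives one result per row and with 2 one result per column, and the dedicated functions return the same values:

```r
# Hypothetical numeric data frame: 3 genes (rows) x 2 samples (columns)
df <- data.frame(sample1 = c(1, 2, 3), sample2 = c(4, 5, 6))

df * 2                            # arithmetic is applied to every value

row_means <- apply(df, 1, mean)   # 1 = one result per row
col_sums  <- apply(df, 2, sum)    # 2 = one result per column

# The dedicated functions give the same values
all.equal(unname(row_means), unname(rowMeans(df)))
all.equal(unname(col_sums), unname(colSums(df)))
```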
Instructions
File path
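To tell R where to find your data, you need to specify the path to the file. There are two types of paths for the same file: the absolute path, which starts from the root of the file system, and the relative path, which starts from the working directory.
Paths are character strings, so they are always enclosed in quotes (").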
Working directory
Change the directory where R is working
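A minimal sketch with `getwd()` and `setwd()`. The temporary directory is used here only so that the example runs anywhere; in practice you would give the path to your own project folder.

```r
# Show the directory where R is currently working
old_dir <- getwd()
old_dir

# Move to another directory (tempdir() is a stand-in for your own folder)
setwd(tempdir())
getwd()

# A relative path is now resolved from the new working directory
file.exists("data/dataForBasicPlots.tsv")

# Go back to the initial directory
setwd(old_dir)
```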
Easy but important
read.table()
To read a data.frame from a multi-column file
file= The path to the file
header= Set to TRUE if the 1st line corresponds to column names
as.is= Set to TRUE to read the values as simple types (no conversion to factors), recommended
sep= The character that separates the columns, e.g. , or \t (tabs)
row.names= The column number to use as row names
Instructions
Import dataForBasicPlots.tsv into an object called mat.ge
mat.ge = read.table("data/dataForBasicPlots.tsv", as.is=TRUE, sep="\t", header=TRUE)
nrow(mat.ge) ## number of genes
[1] 20000
[1] 100
sample1 sample2 sample3 sample4 sample5
gene1 0.000 0.000 0.000 0.000 0.000
gene2 0.000 0.000 0.000 0.000 0.000
gene3 1.036 0.383 -0.179 -0.478 0.125
gene4 0.000 0.000 0.000 0.000 0.000
gene5 0.605 1.214 2.598 1.208 2.539
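To try the read.table() parameters without the course data, here is a self-contained sketch: it first writes a tiny tab-separated file with made-up values to a temporary location, then reads it back. The object name `small.ge` and the file content are assumptions for illustration.

```r
# Write a tiny tab-separated file to a temporary location (made-up values)
tsv_file <- tempfile(fileext = ".tsv")
writeLines(c("gene\tsample1\tsample2",
             "gene1\t0.000\t1.036",
             "gene2\t0.605\t1.214"),
           tsv_file)

# Read it back, using the first column as row names
small.ge <- read.table(tsv_file, header = TRUE, sep = "\t",
                       as.is = TRUE, row.names = 1)

dim(small.ge)         # number of rows, then number of columns
small.ge["gene2", ]   # select a row by its name
```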
write.table()
To write a data.frame in a multi-column file
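A possible round trip, sketched with a made-up data frame and temporary files. The parameters and the saveRDS()/readRDS() functions used here are the ones listed just below.

```r
df <- data.frame(gene = c("gene1", "gene2"), expression = c(1.5, 0))

# Temporary files, used so that the example does not touch your folders
out_file <- tempfile(fileext = ".tsv")
rds_file <- tempfile(fileext = ".rds")

# Tab-separated file with a header line, no row names and no quotes
write.table(df, file = out_file, sep = "\t",
            col.names = TRUE, row.names = FALSE, quote = FALSE)
readLines(out_file)

# Save the object itself and read it back unchanged
saveRDS(df, rds_file)
df_copy <- readRDS(rds_file)
identical(df, df_copy)
```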
df The data.frame to write
file= The file name
col.names= TRUE prints the column names in the first line
row.names= TRUE prints the row names in the first column
quote= TRUE surrounds character values with double quotes
sep= The character that separates each column. ’ ’ by default.
saveRDS() Save one R object into a file. Use the .rds extension
save() Save multiple R objects into a file. Use the .RData extension
save.image() Save the entire R environment
readRDS() Read an R object from a .rds file
load() Load R objects from a .RData file
Easy way
Automatic way
- Open a graphics device (pdf(), png(), jpeg()…)
- Close it with dev.off()
In next session…
hist
Plot the value distribution of a vector.
x The vector with the values to plot
plot
Plot one vector against the other.
x The first vector to plot (x-axis)
y The second vector to plot (y-axis)
type How the points are plotted. “p” for points, “l” for lines
boxplot
Plot the distribution of variables.
x The matrix of distributions
main= A title for the plot
xlab=/ylab= A title for the x/y axis
xlim=/ylim= A vector of size two defining the desired limits on the x/y axis
Extra parameters
col= The colour of the points/lines
pch= The shape of the points
lty= The shape of the lines
Extra functions
lines() Same as plot but superimposed on the existing one
abline() Draw a vertical/horizontal line
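The plotting functions and parameters above can be combined as in this sketch. The data are random numbers generated for the example, and the plots are written to a temporary PDF file so that the code runs without a graphical window.

```r
set.seed(1)                      # reproducible random values
x <- rnorm(100)                  # made-up data
y <- x + rnorm(100, sd = 0.5)

plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)                   # open the file as a graphics device

hist(x, main = "Distribution of x", xlab = "value")
abline(v = mean(x), lty = 2, col = "red")   # vertical dashed line at the mean

plot(x, y, type = "p", pch = 19, col = "blue",
     main = "y against x", xlab = "x", ylab = "y")
lines(sort(x), sort(x), col = "grey")       # line superimposed on the plot

boxplot(cbind(x, y), main = "Distributions of x and y")

invisible(dev.off())             # close the device to finish the file
```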
Instructions
Generate plots for:
1. a boxplot of columns 1 to 10.
2. the distribution of the median gene expression. Add a vertical dotted line to mark their average value.
3. the distribution of the median sample expression. If there is any visual outlier, remove it and check the distribution again.
4. the expression of gene 333 against gene 666. Superimpose in red triangles the expression of gene 333 against gene 667.
Logical type
TRUE / FALSE values
== Are both values equal?
> or >= Is the left value greater than (or equal to) the right value?
< or <= Is the left value smaller than (or equal to) the right value?
! NOT operator, negates the value
| OR operator, returns TRUE if either is TRUE
& AND operator, returns TRUE if both are TRUE
Any logical test can be vectorized. which() returns the indices of the elements with TRUE values.
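A short sketch of vectorized tests on a made-up vector, including `which()` and two common follow-ups (counting and extracting the matching values):

```r
values <- c(5, 0, 12, 3, 8)   # made-up values

values > 4                    # one TRUE/FALSE per element
values > 4 & values < 10      # both conditions must hold
values == 0 | values > 10     # at least one condition holds
!(values > 4)                 # negation

which(values > 4)             # positions of the TRUE values
sum(values > 4)               # TRUE counts as 1, so this counts them
values[values > 4]            # the values themselves
```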
Instructions
Instructions
Write an if block that automatically classifies the expression of the first gene as:
- ‘high’ if its maximum value is higher than 4
- ‘low’ if not
mean(lunkyNumbers)
Do your own
function() to define a function
return() specifies what will be returned by the function
This function takes a vector as input and:
- removes values lower than 3
- changes to 8 the values higher than 8
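The code of that function is not shown here. A minimal sketch that matches the description could look like this, where the name `clip_values` is hypothetical:

```r
# Hypothetical name; the behaviour follows the description above
clip_values <- function(x) {
  x <- x[x >= 3]    # remove values lower than 3
  x[x > 8] <- 8     # change to 8 the values higher than 8
  return(x)
}

clip_values(c(1, 3, 5, 9, 12, 2))
```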
:::
Instructions
Create a function that classifies the average value of a vector. It returns:
- small if the average is below 3
- medium if the average is between 3 and 7
- high if the average is above 7
Test your function on vectors with random numbers from 0 to 10.
Instructions
Create a function that:
1. returns the average of the minimum and maximum value of a vector
2. returns how many values are higher than 3 in a vector
Test your function on vectors with random numbers from 0 to 10.
Instructions
Use your functions on the whole mat.ge data frame
Instructions
Instructions
Write a function that computes the mean values of the columns:
- using the apply function
- using a for loop
- using a while loop
paste() Pastes several characters into one
grep() Searches for a pattern in a vector and returns the index when matched
grepl() Searches for a pattern in a vector and returns TRUE if found
strsplit() Splits a string
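These four functions can be sketched on a made-up character vector:

```r
genes <- c("BRCA1", "BRCA2", "TP53")   # made-up vector

paste("gene", genes, sep = "_")        # one string per element
paste(genes, collapse = ";")           # a single string

grep("BRCA", genes)                    # positions of the matches
grepl("BRCA", genes)                   # TRUE/FALSE for each element

# strsplit() returns a list with one vector per input string
strsplit("sample1;sample2;sample3", split = ";")[[1]]
```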
Instructions
Write R commands to address each question. Only one-line commands are allowed. The shorter the better.
1. From a numeric matrix, compute the proportion of columns with an average value higher than 0.
2. From a numeric matrix, print the name of the column with the highest value.
3. From a numeric matrix, print the rows with only positive values.